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Abstract 

We present fast, spatially dispersionless and unconditionally stable high-order solvers for Partial Dif- 
ferential Equations (PDEs) with variable coefficients in general smooth domains. Our solvers, which 
are based on (i) A certain "Fourier continuation" (FC) method for the resolution of the Gibbs phe- 
nomenon, together with (ii) A new, preconditioned, FC-based solver for two-point boundary value prob- 
lems (BVP) for variable-coefficient Ordinary Differential Equations, and (iii) An Alternating Direction 
strategy, generalize significantly a class of FC-based solvers introduced recently for constant-coefficient 
PDEs. The present algorithms, which are applicable, with high-order accuracy, to variable-coefficient 
elliptic, parabolic and hyperbolic PDEs in general domains with smooth boundaries, are unconditionally 
stable, do not suffer from spatial numerical dispersion, and they run at FFT speeds. The accuracy, effi- 
ciency and overall capabilities of our methods are demonstrated by means of applications to challenging 
problems of diffusion and wave propagation in heterogeneous media. 

Keywords: High-order methods, Alternating Direction Implicit schemes, numerical dispersion, variable coef- 
ficient problems. 

1 Introduction 

We present fast, spatially dispersionless and unconditionally stable high-order solvers for Partial Differential 
Equations (PDEs) with variable coefficients in general smooth domains. Our algorithms, which general- 
ize significantly a class of solvers [51 [H] introduced recently for constant-coefficient PDEs, are based on 
(i) A certain "Fourier continuation" (FC) method [5] for the resolution of the Gibbs phenomenon, to- 
gether with (ii) A new, preconditioned, FC-based solver for two-point boundary value problems (BVP) 
for variable-coefficient Ordinary Differential Equations (ODE), and (iii) The Alternating Direction Implicit 
( ADI) methodology [H [14] . One of the main enabling elements in our overall FC-AD algorithm (Fourier- 
Continuation Alternating-Directions) is the new solver (ii) for two-point boundary value problems with 
variable coefficients. Relying on preconditioners that result from inversion of oversampled finite-difference 
matrices together with a new methodology for enforcement of boundary conditions and the iterative linear 
algebra solver GMRES, this algorithm produces rapidly the solutions required for FC-AD time-stepping in 
the variable-coefficient context. (A non-oversamplcd finite-difference preconditioncr related to but different 
from the one used here was introduced in [15] in the context of orthogonal collocation methods for equations 
with constant coefficients.) The resulting PDE solvers, which in practice are found to be unconditionally 
stable, do not suffer from spatial numerical dispersion and they run at a computational cost that grows 
as 0(N log 2 N) with the size N of the computational grid. A variety of examples presented in this paper 
demonstrate the accuracy, speed and overall capabilities of the proposed methodology. 

The variable-coefficient FC-AD algorithms introduced in this paper enjoy all the good qualities associ- 
ated with the constant coefficient solvers presented in [BJ : the performance of the new solvers compare 
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favourably, in terms of accuracy and speed, with those associated with previous approaches; a detailed dis- 
cussion in these regards can be found in the introductory sections of [HMH]. In particular, in this paper we 
demonstrate the high-order, essentially dispersionless character of the new FC-AD solvers by means of solu- 
tions to parabolic and hyperbolic problems. For example, the results presented in Section 7.3. 1| which include 
FC-AD fixed-accuracy solutions at fixed numbers of points-per-wavelength for problems of sizes ranging from 
one to one-hundred wavelengths in size, demonstrate the spatial dispersionlessness of the FC-AD algorithm 
for hyperbolic equations. Further, as shown in Table [T] for example, the proposed FC-AD algorithm can 
evolve a solution characterized by one-million spatial unknowns in a computing time of approximately 1.5 
seconds per time step in a single-core run. 

The remainder of this paper is organized as follows: after the introduction in Section 2.1 of the variable- 
coefficient PDEs we consider, Section |2.2| details the ADI approximations we employ. Section [3] presents 
the FC method, including, in Section [3~3} a special version of the FC algorithm we need to tackle variable- 
coefficient differential equations. Sections [4] and [5] then describe our new FC-based solver for two-point 
BVP, which include 1) Iterative FC-based solvers for periodic ordinary differential equations in a certain 
"continued" periodic context, and 2) Techniques that enable enforcement of boundary conditions in presence 
of either sharp or diffuse boundary layers. The combined FC/ADI scheme is described in Section [6] The 
overall properties of the resulting FC-AD PDE solver, finally, are demonstrated in Section[7]through a variety 
of numerical results. 



2 Preliminaries 

2.1 Variable coefficients PDEs 

This paper presents FC-based solvers for linear equations containing time-independent but spatially variable 
coefficients. While the methods we present are applicable to any partial differential equation for which an 
ADI splitting is available, for definiteness we focus on the basic variable-coefficient parabolic and hyperbolic 
problems 

ad t u — div(/3gradii) = / in Q X (0, T), 

u = g ondOx(0,T), (1) 

u = uq in £1 x {0}, 



and 



adfu — div(/?gradu) = / in Q x (0, T), 



u = g on aOx(0,r), 

u = u Q in il x {0}, 

dtu = m in ft x {0} 

in a bounded open set f2 with smooth boundary <9f2 and within the time interval (0,T). Here a, P, uq, 
Ui, f and g are suficiently regular functions defined in f2; additionally the coefficient functions a and p are 
assumed to satisfy the coercivity conditions 

«o < a < oi\ in fi, 

Po < ft < fti in Q 

for some positive constants o>q, «i, Pq and Pi, while the initial and source functions are required to verify 
the relevant compatibility conditions in <9f2 x {0}, namely, uq = g for the parabolic problem ([!]) and uq = g, 
Ui = dtg for the hyperbolic problem pi). 

2.2 Alternating Direction schemes 

This section presents the ADI splitting schemes we use for the solution of the PDE problems ([T]) and ([2|. 
These splitting schemes result as adequate generalizations of the Peaceman-Rachford scheme [T3] for diffusion 
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problems and the non-centered scheme [12j for the wave equation to the present variable-coefficient context. 
In what follows At > denotes the time step used in the computational time interval [0, T]; it is assumed 
that n max Ai = T for a certain positive integer n max . Letting t n — nAt for integer and even fractional values 
of n (e.g., t n+ i = (n + |)At), we have, in particular, £„ max = T. 

2.2.1 Diffusion equation 

Given the initial values u , the right hand-sides f n+ i(x,y) — f(x,y,t n+ i) and f n+ i{x,y) — f(x,y,t n+ 3), 
and the boundary values g n+1 (x,y) = g(x,y,t n+ i), the exact solution <fi n (x,y) = u(x,y,t n ) for n = 
of the diffusion problem ([lj satisfies the Crank-Nicolson relation [SJ [T3] 

a — divl^grad 1= + C(At z ) in ft, 

0° = w in ft, ( 3 ) 

n+1 = in 9ft. 

Following [14] , an ADI scheme can be obtained from the Crank-Nicolson iteration: denoting by u n+1 the 
corresponding approximation of <fi n+1 (for integer values n — 1,2,...) and using the intermediate quantity 
M n+ 5, the ADI scheme is embodied in the equations 

u n+h -^ dx (pd x u n+ ^)=u n + ^d v (0d y u n ) + ^f n+ i in ft, (4) 
la la la 

(with boundary condition u"+5 = g™+2 on 9ft) and 



+ ' - ^-d y ^d y u n+1 ) = + ^d x ((3d x u n+ i) + ^/"+t in ft, (5) 



(with boundary condition it' i+1 = g n+1 on 5ft); cf. [6]. 

An algorithm based on the ADI iteration can be conveniently obtained as a sequence of four 

operations involving two additional auxiliary quantities w n and w n+ 5 : 

(Dl) Initialize u° and w° as 

u° = uq in ft, 

w° = ( 1 + A^c\ + At^-S^ u° in ft. 
\ 2a 2a y J 

Then, for n = 0, 1, . . . , n max , 

(D2) Obtain u n+ 2 by solving the boundary value problem 

i+ 2 = <7™ + 2 on ft. 

(D3) Update w n+ ^ according to 

w n +h =2u n +5 - w n - — P+i in ft, 
2a 

and, finally, 

(D4) Obtain u n+1 by solving the boundary value problem 

u n+l = g n+l on ^ 
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The ADI scheme thus requires solution of the one-dimensional boundary value value problems (D2) and 
(D4) (see Section [3| and solution updates (Dl) and (D3). Each one of these operations involves differential 
operators with respect to a single spatial variable — as it behooves an ADI discretization |13j . 

Remark 1. An implementation of this algorithm for the case in which the coefficients a and j3 are constant 
was put forth in [B]. In that reference it was noted that, for n € N, u n+1 is a globally second order accurate 
approximation of the exact solution of the diffusion equation ([I]): the error at any fixed time step t = t* 
is a quantity of order O(At) 2 . This is in spite of the approximation u n+ ? = g™ + 5, which is necessary to 
enable applicability to complex domains, and which induces a second-order local truncation error in step 
(D2); see [SJ Remark 3.2] for details. Numerical experiments we present in this paper reveal once again a 
second-order global error in the solutions resulting from the scheme above. As shown in reference [5] and 
Section |7.2[ further, solutions of higher order of temporal accuracy can be extracted from the solutions 
produced by the ADI scheme above by means of the Richardson extrapolation methodology. 

2.2.2 Wave equation 

To derive our Alternating Direction scheme for the linear wave problem ([2]), in turn, we follow [12] and note 
that the exact solution <fi n (x,y) = u(x,y,t n ) satisfies the discrete relation 

~ ' ' V div(/3grad</>™ +1 ) = .T+5 + O(At) in ft, 



At 2 

in ft, (8) 

1 = u + Atm + O(At) in ft, 

"+ 1 = g n+1 in 5ft, 

where, once again, f n+ ^(x) — f(x,y,t n+ i) and g n+1 (x) — g(x,y,t n+ i). Following [T^] we split the stiffness 
term and thus obtain the ADI time-stepping scheme 

(Wf ) Initialize u° and u 1 as 

u° — uq in ft, 

u 1 = uq + Atui in ft. 

Then, for n = 0, 1, . . . , n max , 

(W2) Obtain w n+ ^ by solving the boundary value problem, 

1-At 2d ^d x -At 2 ^dl)w n +^ = 2u n - u"" 1 + — f + * in ft, 

a a J a (9) 

(W3) Obtain u n+1 as the solution of the boundary value problem 



a 

u n+l = g n+l on an> 



1 - At 2 ^d v - At 2 ^d 2 ) u n+1 = w n+l i in ft, 

a a v 



Remark 2. Note that the expressions (D2)-(D4), and (W2)-(W3) of the alternating-direction ODEs for the 
heat and wave equations incorporate a division by the lowest-order variable-coefficient a in equations ^ 
and (Tsl) . This is an essential detail of our algorithm: if such a division by a is not incorporated in the 



algorithm, the iterative GMRES solution of these ODEs (which is presented in Section 5.1| would require 
large numbers of iterations. In the divided form, in contrast, the matrix of the linear-algebra problem is 
close to the identity for small At, and small numbers of GMRES iterations suffice to yield highly accurate 
ODE solutions. In fact, we have found in practice that the divided ODE forms give rise to small numbers of 
iterations even for large values of At. 
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Remark 3. It is easy to check that the scheme (W1)-(W3) is first order consistent. We refer to [321 Section 
5] for a discussion of the global order of accuracy of the algorithm; in practice, and in agreement with that 
reference, we find the algorithm produces solutions with global first order accuracy. As shown in reference [B] 



and Section 7.3.2 further, solutions of higher order of temporal accuracy can be extracted from the solutions 
produced by the ADI scheme above by means of the Richardson extrapolation methodology. 

Problems (D2), (D4), (W2) and (W3) amount to two-point BVP of the form 

u — pu — qu" = / in (a, b), (10) 
u(a) = d a , u(b) = db, (11) 

where the right-hand side / and the variable coefficients p and q are bounded smooth functions defined in the 
interval [a, b], with q > r\ > for some constant rj. More precisely, the ODE coefficients and the right-hand 
sides of the problems (D2) and (D4), (W2) and (W3) are given by 

p(x) = P H (x,y), q(x) = Q(x,y), f(x) = F(x,y,t) in problems (D2) and (W2) (y,t fixed), and 
P(y) = ^fay), g(y) = Q(x,y), f(y) = F(x,y,t) in problems (D4) and (W3) (x,t fixed), 

(12) 

where for (D2) and (D4) 
while for (W2) and (W3) 

V»{x,y) = A* 2 M^, V-(x,y) = At^M^, Q(x,y) = At»^, F{x t y t t) = A^^f. 

a{x,y) a(x,y) a(x,y) a{x,y) 

(14) 



To solve the two-point BVP (10l-(ll) numerically we utilize a discrete method based on three main 
elements: 1) The Fourier Continuation method (see Section |3| to produce accurate Fourier-series approx- 
imations of the ODE source terms and variable coefficients; 2) A specialized Fourier collocation method 



for solution of ODE boundary value problems (see Section 4.1) — which, in view of item 1), can be applied 
to non-periodic boundary-value problems without the accuracy degradation associated with the Gibbs phe- 
nomenon; and 3) A new strategy for the enforcement the boundary conditions in the context arising from 



items 1) and 2) (see Sections 4.2 and 4.3 1. 



3 Fourier Continuation and scaled FC(Gram) 



3.1 Discrete Fourier analysis background 

3 O 

discrete Fourier series of v £ Cp er (a, c) is given by 



Let Cp 0r (a, c) denote the space of /c-times continuously differentiable periodic functions of period c — a. The 



Jv[x)= ^2 ^e i27r "-« a) £ B(a, c). 

n£T(F) 

Here T(F) = {n £ N : -F/2 + 1 < n < F/2} for F even and T(F) = {n £ N : (F- l)/2 < n < -(F- l)/2} 
for F odd, 

- . 2Tn(x-a) I 

g : (a, c) -> K such that g(x) = .9" e * c_a ( ( 15 ) 

ner(F) I 
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denotes the -F-dimensional space of trigonometric polynomials, and the amplitudes v n are given by the 
trapezoidal-rule expression 

f 

1 J-., , 2^(^-0) 
in = pl^ v ( x i) e 1 c ' a > neT(F). 
3=1 

Note that, for conciseness, the "degree" F is not explicitly displayed in the notation B(a, c). 
We point out that, as is well-known [21 111). 

1. An element of the set B(a,c) is determined uniquely by its values at the equispaced grid a — x\ < 
. . . < Xf = c — h, where 

xj = a + (j — l)h, j = 1, . . . , F, h = (c — a)/F, and, (16) 

2. The discrete Fourier operator J is an interpolation operator from C^ eI (a, c) into B(a,c), that is, 
Jv(xj) = v{xj) for all j = 1, . . . , F. 

3.2 Fourier Continuation and FC(Gram) 

The Fourier Continuation method [JJ[B] is an algorithm which, acting on a set of values f(x±), . . . , /(xjv) of a 
function / : [a, b] M at iV equidistant points Xj € [a, b], produces a trigonometric polynomial f c € B(a, c), 
of a given degree -F and on a given interval [a, c] D [a, 6] with b < c, which approximates the function / 
closely on the interval [a,b]. Note that the endpoints a and b are not required to belong to the Fourier 
grid {xj}jL 1 . Clearly, for [a, 6] = [a, c], the trigonometric interpolation operator J leads to a naive Fourier 
Continuation procedure which gives rise to the well known Gibbs ringing near x = a and x — b — unless / 
is a smooth periodic function of period (6 — a). The selection of a larger expansion interval [a, c] allows for 
a smooth transition between the values f(Xj) near j = N to the values f(xj) near j = 1, and thus enables 
highly accurate Fourier approximation in the interval [a, &], avoiding the undesirable Gibbs ringing effect and 
related accuracy deterioration. 

A number of algorithms has been introduced which provide accurate Fourier continuations for smooth 
non-periodic functions (including, for example, the FC(SVD) method [6!, which relies on Singular Value 
Decompositions and the related algorithms [31|S][7])- Use of the spectrally accurate Fourier continuations 
as a component of efficient PDE solvers, however, must rely on correspondingly efficient Fourier Continu- 
ation algorithms. For that purpose, an accelerated FC method, the FC(Gram) procedure, was introduced 
recently [H [T5]. In this section, we present a brief description of FC(Gram) method; full details can be 
found in [B]. A new "scaled" version of the FC(Gram) approach, which is necessary for the applications 
presented in this paper, is introduced in Section |3.3| 

For the present description of the FC(Gram) method, without loss of generality we assume [a, b] = [0, 1], 
as in references [6JQ2]. The FC(Gram) algorithm is based on the A^a left-most and N<\ right-most grid 
values f(xj) of / within the subintervals [0, A] and [1 — A, 1], with A = hN&. Fixed the number of extension 
points (belonging to [1, 1 + d\ with d = h(Nd — 1)) where f c (xj) has to be computed, the continuation 
problem is reduced to seek a smooth periodic function f c in [a, c] = [0, 1 + d]. For adequate reference note 
that our quantities N, N/\ and Nd are denoted by n, ua and in the contribution [SJ. 

To compute / c , the left-most grid values of / in [0, A] are mapped in [1 + d, 1 + d + A]. Then, they 
are projected on certain Gram bases of orthogonal polynomials — with orthogonality dictated by the natural 
discrete scalar product defined by the grid points. Finally, using the orthogonal polynomial expansion, a 
periodic matching function / ma tch, in the interval [1 — A, 1 + 2c? + A] is computed. This function consists of 
a sum of contributions of each polynomial projection (up to degree m) and blends smoothly the values at 
the left and right subintervals, [1 — A, 1] and [1 + d, 1 + d + A]. The grid values of / ma tch in [1, 1 + d] give 
the required grid values of f c . This procedure can be efficiently implemented by precomputing the sets of 
matching functions {/evenl^Lo ano - {/oddi^Lo associated to the even and odd pairs of the polynomial basis 



G 



in the interval [0, 1] and then, each FC(Gram) continuation in an arbitrary interval can be obtained by using 
an affine transformation (see [SJ Section 2.3] for details). 

It is clear that the accuracy of the Fourier continuation operator depends on the number TV of Fourier 
grid points contained in [0, 1], the maximum degree of polynomials m used in the polynomial projections, 
the number TVa of right and left grid points, and also on the number N4 of extension points outside the 
interval [0, 1] (see item [2] in Section [7] for an indication on the actual values of these parameters used in our 
numerical examples). 



3.3 Scaled FC(Gram) algorithm 

The FC-based PDE solvers presented in this paper rely on the FC methodology as a tool for solving two-point 
BVPs for ordinary differential equations with variable coefficients. As discussed in Section [573] to obtain an 
accurate solution of the relevant two-point BVPs for a given PDE problem in the present variable-coefficient 
context, our method requires, unlike previous FC-based approaches, use of numbers N g of extension points 



and associated extension intervals of length g that vary (linearly) with N (see Section 5.2). Use of such 
iV-dependent extension intervals in the procedure described in Section |3.2| entails evaluation of correspond- 
ingly TV-dependent sets of matching functions {/e Ven }^L and {/oddJvlo — a procedure that is inelegant and 
computationally expensive. 

To overcome this difficulty, a slightly modified "scaled" version of the FC(Gram) algorithm is introduced 
in this section. This approach is based on use of a set of precomputed matching functions /match, as 



described in Section 3.2 in a fixed interval [1, 1 + d] and for a certain fixed value Nd- To obtain inexpensively 
a matching function at a new number N g of extension points in a new interval [1, 1 + g] (where it is assumed 
that g/Ng = d/N d ), the "scaled" procedure uses as a matching function a composition of the form 

/match °£, (17) 

where £ : [1 — A, 1 + g + A] — » [1 — A, 1 + d + A] is a smooth diffcomorphism. To preserve the point values of 
/match in [1 — A, 1] and [1 + d, 1 + d + A], the scaling function £ is assumed to be map linearly the intervals 
[1 — A, 1] and [1 + g, 1 + g + A] onto [1 — A, 1] and [1 + d, 1 + d + A], respectively. Throughout this paper 
we use the scaling function 

{(.) - 1 - A + (1 + 2A)fac ( ' + <" - "*«^" ~ ' + A ) »E[l,l +9 ], (18) 

where, letting \x\ denote the largest integer less than or equal to x, we have set frac(a;) = x — \ x\ , and the 
auxiliary function <p is given by 



(s) = { 



1 1 



if s > 1, 



l + exp^--_ )) if, e [0,1], 
if s < 0. 



In what follows, the scaled FC(Gram) continuation of a function / resulting from this procedure will be 
denoted by either of the following symbols 

f = E(f) = E g (f). (19) 

The first two of these notations can of course be used only when the value of g is either inconsequential or 
implicit in the context. The scaled FC(Gram) algorithm produces Fourier continuations for varying values 
of g and N g at a computational cost that is essentially the same as that required by the original FC(Gram) 
procedure for a fixed number Nd of extension points. 
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Remark 4. Once the even and odd continuation of the Gram polynomials are computed and stored in 
memory, the FC(Gram) and scaled FC(Gram) continuations of any function only involve the computation 
of m inner products of size N/\, which requires 2m sums and products, and the evaluation of a trigonometric 
interpolant, which is performed by means of an FFT of size N + N g — 1. Since m and are independent 
of N, both the scaled and un-scaled versions of the FC(Gram) algorithm run at a computational cost of 
0((N + N g - 1) log(iV + N g - 1)) operations. 



4 FC BVP solver I: particular solution and boundary conditions 



This section presents an FC-based method for solution of variable-coefficient BVPs of the form (10)-(11| 



As explained in what follows, this approach relies on the scaled FC(Gram) method to produce a periodic 
extension of the non-periodic problem ( |10| ) — ( 11 ) to an interval (a, c) D (a, b), with boundary values at the 
endpoints of the original interval (a,b). As detailed in Section 4.1 a particular solution for equation (10) 



can easily be produced on the basis of the FC(Gram) method. The enforcement of the boundary conditions, 
which requires some consideration, is presented in Sections |4.2| and |4.3| In the first one of these sections a 



direct numerical approach is presented for the evaluation of solutions of the homogeneous problem ( 10 1— ( 1 1 ) 
(/ = 0) with non-zero boundary values, which can be used to correct the boundary values of a particular 
solution. In Section |4.3| a complementary approach is introduced, which can effectively treat challenging 



boundary layers and stiff equations that arise as small time steps and correspondingly small coefficients q(x) 
are used (cf. equations ([6]), ^ and 



4.1 FC-based particular solution 



To obtain a periodic embedding to the interval (a, c) D (a, b) of the BVP (10)-(11| wc utilize Fourier 
continuations of the functions /, p and q. To preserve the ellipticity of the problem, however, we must 
ensure that the extension used for the coefficient q takes on strictly positive values. To produce such strictly 
positive extension of a positive function q we utilize a infinitely differentiable diffeomorphism r\ : K — > {C±, C2) 
such as, for instance, r](s) = C\ + [pi — Ci)(l + arctan(s))/2. Using such a diffeomorphism we define certain 
"limited extensions" of a positive function q by means of the expression 



q e = r/o E(i] 1 oq), 



(20) 



where E denotes the Fourier continuation procedure defined in (19 1. The expression (20 1 ensures that the 
values of the periodic extensions <f are bounded by above and below by the positive values C\ and C2, 



(21) 



respectively. Hence, the "periodic embedding" of the ODE problem (|10|)-( 11 ) is given by 

u — pvf — q l u" = f in (a, c), 



where, in accordance with equation (19), p and / denote the scaled FC(Gram) continuation of the functions 
p and /. 

To produce approximate periodic solutions of period c — a of the ODE problem ( 10 ), our algorithms relics 
on Fourier collocation: a numerical solution of the form 



uj(x) 



neT(N+N d -l) 



u n e 



e B(a, c) 



is sought that satisfies ( 10 ) at each one of the grid points Xj, j = 1, . 

duf d 2 Uf 
Uf{xj) - p{xj) — {xj) - q (xj)^J-(xj) = f(xj) 



dx 2 



,N + N d -1, that is 



for j = l,...,N + N d -l. 



(22) 



(if problem ( 10 l-flT]) has a unique solution, this system of equations is not singular, see e.g. [2].) Since the 
solution uj is determined uniquely by its values u^{xj) for j = 1, ... ,N + N d — 1 the M Ar+A ' <1_1 -vector of 
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point values of Uf(xj), j = 1, . . . , N + Nd — 1 are used as the unknowns of the problem. This linear system 
of equations is solved by means of the iterative solver GMRES; see Section [53] for details. 



Remark 5. Note that, since the coefficients p and q are variable, the matrix associated to the linear system 
(22 1 is not sparse. To avoid the expense associated with a direct solution of this linear system we utilize a 
preconditioned iterative solver, as described in Section [5j 

Clearly, the particular solution described in this section does not generally satisfy the boundary conditions 
imposed in (11). The necessary corrections are described in the following section. 



4.2 Boundary conditions I: exterior sources 



To enforce the necessary boundary conditions in (111 we consider two auxiliary boundary value problems 



involving ODEs of the form (10) but with certain adequately-chosen right-hand sides g a , gb £ B(a,c). The 
right-hand sides g a and gb are taken to vanish at all discretization points in the original interval (a, b) but 
not to vanish in (6, c) — that is to say, the selected right-hand sides correspond to sources supported in the 
exterior of the physical domain [a, 6], see e.g. Figure [2] right. Clearly, any linear combination of the form 
/ + X a g a + Xbgb with A a , Xb € K coincides with / in the part of the spatial grid contained in (a, b). Calling 



u a and Ub £ B(a,c) the approximate FC solutions (as described in Section 4.1) corresponding to g a and gb, 
it is clear that any linear combination of the form X a u a + XbUb satisfies the ODE ( fl0| with null right-hand 
side at each one of the collocation points Xj £ (a, b). It follows that, denoting by (Ao jai Ao,b) and (Ai „, Ai &) 
the solutions of the 2x2 linear systems 



u a {a) lit (a) 
u a {b) u b (b) 



Ao,a 

Ao.fc 



and 



u a (a) u b (a) 
u a (b) u b (b) 



Al, a 

Xih 



(23) 



(as shown below, the system matrix is invertible for sufficiently fine discretizations provided the functions 
g a and gb are chosen appropriately), the functions w a — Xo, a u a + Xo^Ub and Wb — Xi. a u a + Xi^Ub are 



approximate FC solutions of ( 10 1 with null right-hand side, which satisfy the Dirichlet boundary conditions 
Wa(a) = 1, w a (b) = 0, Wb(a) = and Wb{b) = 1. Thus, the Fourier series 



u N = uj + (d a - Uj(a))w a + (d b - Uj(b))w b 



(24) 



conditions (11) 



is an approximate FC solution of ODE problem ( 10 ) that satisfies exactly the prescribed Dirichlet boundary 



The needed invertibility of the system matrix in equation ( 23 ) for sufficiently fine discretizations indeed 
holds provided the functions g a and gb satisfy the conditions 



g a (x)dx > 0, 



gb(x) dx = 0. 



(25) 



To establish this fact we let U a and Ub be the exact (c — a)-periodic solutions of equation ( 10 ) with respective 
right-hand sides g a and gb-, and, using Lemma [T] presented in Appendix [XJ we show at first that the "exact- 
solution system matrix" , that is, the matrix that results as u a and Ub are substituted by U a and Ub in 
equation ( |23[ ), is invertible. Indeed, if the exact-solution system matrix is singular then the U a and Ub 
satisfies U a (a) + fJ,Ub{a) — and U a (b) + [iUb{b) = for a certain /igl. The function V = U a + ^Ub is then 



a solution of the two-point BVP ( 10 )—(!!) with / = 0, d a = and db = 0, and thus, it vanishes identically 



since, in view of Lemma [T] this problem admits a unique solution. It then follows by the smoothness and 
periodicity of V that V(b) = V'(b) = and V(c) — V'(c) — 0. But this is a contradiction, since Lemma [l] 
tells us that such solutions do not exist under the hypothesis (25). Hence, the exact-solution system matrix 



is invertible. The invertibility for the approximate-solution system matrix (23) for sufficiently fine grids then 



follows from the convergence of the FC solutions u a and Ub to U a and Ub as the grid-size tends to zero (see 
Remark [6| . 
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Remark 6. Stability and convergence proofs, which are beyond the scope of this paper, are left for future 
work; here we merely note that convergence of the FC solutions to exact solutions is amply demonstrated 
by the numerical results presented in Sections [5] and [6] 

Remark 7. For the numerical experiments presented in this work we have used the auxiliary right-hand sides 

g a = o ip, 

max \ip(x)g a (x)\ ' 

xe(b,c) 

where ip(x) = 2x — (b + c)/(c— b) and cj)(x) = exp(l — 1/(1 — x 2 )); see e.g. Figure [2j As required by Lemma[l] 
in Appendix [A] the functions g a and gb satisfy the assumption (25), and their supports, which are contained 
in the interval (6, c), do not intersect the interval (a, 6). 

Remark 8. Note that the solution u and its numerical approximation un generally possess boundary lay- 



ers [TJ Ch. 7] at the interval endpoints a and b in cases in which the coefficient function q in equation ( 22 ) 
is small — which, in view of equations ([6]), Q and j9j|, it certainly is for ODE problems arising from the FC 
PDE solver for small values of At. The presence of such boundary layers can be appreciated by considera- 



tion of equation ( 24 ) : the functions w a and Wb provide (numerical approximations of) the boundary-layer 
contributions. 

Remark 9. The numerical boundary-layer FC solutions w a and Wb mentioned in Remark [8] may be affected 
by Gibbs-like ringing errors near the endpoints unless the underlying grid adequately resolves the exact 
boundary layers. To ensure the detection of a non-resolved boundary layer (so that a procedure can be used 
to guarantee that the discretization errors are not inherited by w a and Wf, in stiff problems), the grid size h 
is compared to the quantities \J q l (a) and yfq l {b). If the grid-size h is of the order of or smaller than these 
quantities, the boundary layer is well-resolved by the numerical approximations w a and Wb and no further 
action is needed. Otherwise, if h is larger than \J q e (a) and ^q f '{b), then w a and Wb do not adequately 



resolve the boundary layers, and an alternative treatment is necessary (Remark 16 presents details on the 



actual thresholds used in our numerical examples). Such an alternative method, based on use of asymptotic 



expansions, is presented in Section 4.3 



4.3 Boundary conditions II: High-order asymptotic-matching expansions 

As noted in the introductory paragraph of Section |4j stiff ODEs and challenging boundary layers arise in the 
solutions of the boundary value problem ( 10 )-(|TT|) as small time steps are used. In this section we present an 
algorithm that can resolve such boundary layers, without resort to unduly fine spatial meshes, on the basis 
of the method of matched asymptotic expansions [1 ] . For our development of this asymptotic procedure, we 
introduce a small positive parameter e (that is taken to equal At in equation ^ and yj At/2 in equations ^ 



and (|7|), we let p — e 2 po and q = e 2 qo, and we re-express equation (10) in the form 

(Lu)(x) = r(x)u(x)-e 2 -^ {s{x)^{x)j = f(x) , x € (a,b), (26) 
u(a) = d a , u(b) = d b , (27) 

where / and the variable coefficients s and r are bounded smooth functions defined in the interval [a, b] 
(which satisfy the conditions s, r > r\ > for some constant 77) given by 

1 ds s(x) 

Po{x) = -rTTrW' <lo{x) = - n -- 

r[x) dx r[x) 

Further, using the change of variables 

y{x) = [ -^rdr, xe (0,6), (28) 
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equation ( 26 ) can be re-cast in the simpler form 



2 d 2 u 



Lu(y) = f(y)u(y)-e z ^(y) = 0, y € (0,b), (29) 

where f(y(x)) = r(x)s(x), b = y(b) and ii(y(x)) — u{x) 

In what follows we produce, on the basis of the method of matched asymptotic expansions [T], approximate 



solutions w a , £ and Wb. e of the homogeneous (/ = 0) version of equation (26), satisfying w a _ e (a) — 1, w aiS (b) 



0, Wb, E (a) — 0, Wb, s (b) = 1; like the functions w a and Wb introduced in the previous section, w a ^ e and Wb lS can 



be used to correct the boundary values of a periodic FC solution u of ( 26 ) . Unlike the functions introduced in 
the previous section, however, the asymptotic-expansion solutions w a , e and Wb j£ produce accurate solutions in 
the small e regime without requiring use of fine meshes, and they can be evaluated with a fixed computational 
cost for arbitrarily small values of e. 

Remark 10. As mentioned above in this section, the small parameter e included in the variable coefficients p 
and q corresponds to the quantities At and \J At/2 in the time discretization of the wave and heat equations, 
respectively. Hence, the uniform convergence of the ODE solutions as h — > with respect to e guarantees 
the spatial convergence of the FC-AD scheme independently of the value of At used in the time-marching 
schemes. 

We lay down our procedure for evaluation of w a (x) (left boundary layer); the corresponding method for 
Wb(x) (right boundary layer) is entirely analogous. Using the change of variables y — y(x) we define a new 



unknown w = w(y) by w(y(x)) = w a {x)\ clearly w(y) satisfies equation (29) together with the boundary 
conditions 

w(0) = 1, w(b) = 0. 

To obtain the approximate solution w(y) for small values of e we use outer and inner solutions in the 
corresponding matched asymptotic expansion pQ associated with the left boundary layer. Setting e = in 
the ODE operator L and using the null boundary condition at y = b we see that the lowest order term in 
the outer asymptotic expansion vanishes. In fact, as it follows from the discussion below, the outer solution 
vanishes to all orders: 

Wout(y) =0, ye (0,6). 

In the inner region, in turn, we use the scaled variable Y = y/e and we express the inner solution Wi n (Y) ~ 
w(sY) and the ODE coefficient r as asymptotic expansions 

k k 

w in (Y) = £ j Wj{Y), f(sY) = ^ e j fjY j . (30) 



In particular, the Taylor expansion of the ODE coefficient f(sY) induces a corresponding expansion for the 

k 

i=0 ( 



ODE operator L, namely L = X^=o eJ A?' wnere 



d 2 

L o = -j^+ r o, L 3 ; = fjY 1 , j = 1, . . . ,k. 

It follows that the functions Wj are solutions of the following ODE problems: 

f L o w o (Y)=0, Fe(0,oo), 

^o(0) = 1, (31) 
lim w Q (Y) = 

Y— >-oo 



n 



for j = 0, and 



LjWjiY) = L m Wj- m (Y) = Y f m Y m w ] _ m {Y), 

m — 1 rn — 1 

^(o) = o, 

lim wAY) = 

s Y — s-oo 



re(o,oo), 



(32) 



for j = 1, . . . , k. It is easy to check that Wq(Y) — exp(— y/r^Y), that, similarly, wj can be expressed in closed 
form in terms of adequate combinations of exponentials and polynomials (see Algorithm [I]), and finally, that, 
as claimed above, all terms in the outer asymptotic expansion vanish. 
In sum, we have 

'y( x Y 



k 
3=0 



(33) 



where the functions Wj satisfy (31) and (32); a corresponding asymptotic boundary correction function w b , s 
can be obtained similarly. The maximum errors resulting from these approximations satisfy \\w a — w atS \\ = 
0(e k+1 ) and \\wb — ^6,e|| = 0(e k+1 ) for small e. Using these asymptotic boundary correction functions 



we then obtain an approximate FC solution of ODE problem (10), which satisfies exactly the prescribed 



Dirichlet boundary conditions (11), by means of the expression 



(34) 

Remark 11. Except for the accuracy tests presented in Section |5.3| all numerical examples presented in 
this paper use asymptotic expansions of order k = 3 for evaluation of w a ^ £ and Wb, £ whenever asymptotic 
expansions are needed for evaluation of boundary corrections. The selection of the value fc = 3 is based, 



precisely, on a series of numerical tests presented in Section 5.3 



We conclude this section with a description of the algorithm we use for evaluation of the terms in the 



asymptotic series (33). At first a Taylor series centered at y — for the ODE coefficient f is obtained, cither 



from a closed form expression or by numerical differentiation of an associated Fourier continuation series: 
the derivatives needed to evaluate 



Mr 



(0) = s(x) 



dx 



(r(x)s(x)) 



for j = 0, . . . , fc. 



(35) 



can be accurately produced by differentiation of limited extensions provided by the scaled FC(Gram) proce- 
dure, that is, by replacing r and s by f l and s l , respectively (see Subsection 4.1 ). Hence, only the grid values 



of the ODE coefficients are necessary to approximate the Taylor coefficients. This numerical differentiation 
procedure was used for all of the examples requiring use of asymptotic expansions in Sections |5.3| and [7 

Additionally, it is also necessary to provide an accurate numerical approximation of the primitive ( 28| 
that defines the change of variables y(x). Our algorithm produces this primitive by relying on the Fourier 
continuation s . Using this function we obtain the Fourier series 



4^ 



E 

n£T{N+N d -l) 



cr„e 



e B(a, c) 



(by means of a Fast Fourier transform (FFT)), from which integration is straightforward: 

Vn{x) 



(Tq I dr + {x — a)<7o, x € (a, b). 



(36) 



In summary, the pseudocode in Algorithm [T] outlines the procedure we use for evaluation of the grid values 
of the asymptotic expansion w a _ e of order fc, from two given vectors of grid values of the variable coefficients 
r and s. 
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Algorithm 1 Evaluation of the fc-th order asymptotic expansion w a , e for the ODE problem ( 26 1— ( 27 ) with 

variable coefficients r and s. 

Given the grid values of the funcions r, s in (a, b) 

fo, . . . , fk <— taylor_coef (k, r,s) / / compute the Taylor coefficients in equation ([35]) 



w , 



-ode_solve(fo, . . . ,ffe) // compute analytically the ODE solutions (|31| and (32) 



y N <-int_fft(l/s*) 
w in <^expand(u> , ...,w k ) 
w a>e <-compose(w in ,yAr/e) 
Return the grid values of w a ,i 



II integrate 1/s using a zero-mean Fourier series (36) 



/ / evaluate the sums ( 30 ) 



/ / compose the inner expansion with Y = / e 



4.4 Discrete ODE operators 



Here we introduce two operators An and Bn associated with the ODE solvers described in the previous sec- 
tions; use of these operators facilitates the introduction of the overall PDE solvers in Section [6] Briefly, the 
first one of these is the "ODE solution operator" (described in Sections 4.1 through 4.3) which, given coeffi- 
cients, right-hand sides and boundary values, produces approximate grid values u — (un(xi), . . . ,un(xn)Y 
of the solution of the two-point BVP (10)-(11). The operator Bn, amounts, simply, to evaluation of the 
ODE differential operator, for given ODE coefficients, for a given discrete function w. 

More precisely, given the equispaced grid points x±, . . . , xn in the domain (a, b) of the two-point boundary 

(p!,...,p N y G R N and q = (q ll ... 1 q N ) t G R N , of the 
the linear operator (matrix) An 



value problem (10)-(11) and the grid values p 
ODE variable coefficients p and q 



(h,...,fNY g 

point values u = 



= An(p, q) maps the grid values / = 
of the right-hand side / and the Dirichlet data d — (d a , d^f € 1 
un{x\), . . . ,un(xn)Y of the ODE solution as defined in Sections 



2 into the approximate 
through 



4.1 



4.3 



p3JV+2 



i-> u = A N (p,q) 



(37) 



Analogously, the matrix Bn — Bn{p, q) maps the grid values w — (w±, . . . , wnY G into the grid point 



values g = (gi, . . . , <?at) resulting from evaluation of the periodic ODE operator on the left-hand side of (21 ) 



on w, that is to say, 

(w,p,q) eR 3N ^g = B N (p,q) we R N , (38) 

where, letting w G B(a,c) denote the scaled FC(Gram) continuation obtained from the grid values w, we 
have set 



dw 



d w 



9j 



w(x s ) -pix^ — ixj) - q {xj)-^{ Xj ) 



for j = 1, 



Remark 12. Note that, 1) As pointed out in Section 3.2 the endpoints a and b are not required to belong 
to the Fourier-continuation grid {xj}^ =1 (and, in fact, they are chosen not to belong to the spatial grid 
xi, . . . ,xn used in the context of the ODE operators considered in this section), and, 2) The ODE evaluation 
inherent in the operator Bn does not involve the Dirichlet boundary data d a and c4, since neither a nor b 
belong to the spatial grid used. 

Remark 13. Note that the operators An and Bn depend on the interval endpoints a and b, although the 
notations ( 37 )-( 38 ) do not make this dependence explicit. This fact is relevant in the context of Section [6] 
where the endpoints a and b generally change as varying Cartesian lines and corresponding ODE domains 
are used as part of an alternating-direction PDE solution procedure. 



4.5 Boundary projections 

In order to ensure stability and convergence, the PDE solvers described in Section [6] utilize slightly modified 
versions of the ODE operators An and Bn introduced in the previous section; the resulting modified oper- 
ators, which incorporate prescriptions put forth in [fo] and |12j , are denoted by An and Bn- Here we sketch 
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the projection and correction procedures that produce An and Bn from An and Bn', full details in these 
regards can be found in reference [BJ. 

Letting v denote v — An{p, q)(f ', d)* (resp. v = BN{p,q)w), the vector Aat(p, <?)(/, d)* (resp. Bn(p, q)w) 
is defined as the result of an application to the vector v of boundary projections and corrections resulting 
from the following procedure: 



(a) With reference to Section 3.2 construct a vector v p of length N by replacing the left-most and 
right-most values of v = (v%, . . . , vnY by the corresponding values of certain polynomial approximants, 
the "open Gram polynomial projections", that result from application of the FC(Gram) procedure to 
the corresponding values (vi, . . . , Ujy A )' an d (vna-n+Ii ■ ■ ■ ,vnY, respectively. These corrections are 
called "open" as they do not involve the two endpoints of the interval [a, 6]; see [6j. 

(b) Analogously, construct a vector v b of length N by replacing the Aa left-most and Aa right-most values 
of v = (v\, . . . , Vff) by the corresponding values of certain polynomial approximants, the "closed Gram 
polynomial projections" , that result from application of the FC(Gram) procedure to the Aa + 1-tuples 
(d a ,vi, . . . ,Vn a Y and (vn a -n+i, ■ ■ ■ i^JVi^fe)*) respectively, where d a and <4 are user-provided values 
(which, in the context of the PDE solver of Section [6j are given be the PDE boundary values on the 
corresponding horizontal or vertical ADI lines). These corrections are called "closed" as they involve 
the two endpoints of the interval [a, b]; see [6j. 

(c) Using a second-order centered finite difference scheme obtain a discrete solution v = (yi, . . . , vnY of 



the two-point BVP ( 10 )-( 11 1 with right-hand side / — E(f) (see equation 19) and construct the open 



projection v> p of v, using the prescription in item (a) above. 

(d) The vector An{p, q)(f, d)* (resp. Bn{p, q)w) is now produced as the linear combination (1 — x)v p + 
\v h + v — u p , where 

X = min{l, — d — i — max q e (x)\ . (39) 

I h z x£(a,c) \ 



(The definition ( 39 ) is a direct generalization of a formula put forth in [BJ for the corresponding constant 



coefficient two-point BVP.) 



5 FC BVP solver II: implementation and numerical results 



This section consists of four subsections, Sections |5.1| through 5.4 which discuss, in turn, 1) A method of 



solution of the linear systems introduced in Sections |4.1|and 4.2 via preconditioned GMRES; 2) Selection of 



parameters needed for the scaled FC(Gram) algorithm described in Section 3.3 3) The accuracy inherent 
in the resulting scaled FC BVP solver; and, 4) The computational cost of the ODE solution algorithm with 
parameters as in point 3). 



5.1 Periodic ODE solution via preconditioned GMRES 



The discretization ( 22 ) of the periodic variable coefficient ODE \2\\ gives rise to the full linear system of 
equations 

B N {p,q)w = g (40) 



associated with the matrix Bpf(p,q) defined in equation (38). The solution of such linear systems can be 
obtained efficiently by relying on the iterative linear algebra solver GMRES [TB], on the basis of 1) FFT- 
based evaluation of forward maps given by the matrix Sjv(p,g), and 2) Use of a finite-difference based 
preconditioner — as described in what follows. 

The pseudocode in Algorithm [2] outlines our algorithm for evaluation of the operator Bn(p, q)w for a 
given vector w of grid values. Assuming that the point values p and q of the variable coefficients have 
been obtained a priori, the ODE evaluation involves an application of the scaled FC(Gram) algorithm to 
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w followed by computation of the point values of its first and second derivatives and multiplication by the 
variable coefficients. In order to mitigate aliasing effects in evaluation of products, oversampling by a factor 
of two (obtained from an adequately zero-padded FFTs) is used in the precomputed coefficients p and q as 
well as the function values and derivatives arising from the vector w. 



Algorithm 2 Evaluation of the function g = Bn(p, q)w: application of the ODE operator for a vector w 
of function grid values 



Given the function grid values w 
w 4— scalecLf cgram(w;) 
iv «-fft(u>) 
g <— over sample (w) 
{9,9") ^-fourier.diff (g) 
{g',g")^li±t{g',g") 

9<-g-pg' - qg" 

Return the grid values of g 



II apply the scaled FC(Gram) procedure 
// apply the Fourier transform 

/ / oversample the Fourier series (by zero-padding) by a factor of two 
/ / compute Fourier space derivatives 
// evaluate derivatives in physical space 
/ / evaluate the ODE operator 



As mentioned above, our algorithm obtains the solutions of the linear systems (40) by means of a precon- 
ditioned GMRES solver. (A direct application of the un-preconditioned GMRES algorithm to these linear 
systems requires extremely large numbers of iterations to meet even modest accuracy tolerances: in the case 
of the two-point BVP considered in Figure [l] for example, the numbers of iterations required to obtain a 
residual of 1(T 10 for N = 100, 1000 and 3000 are 145, 1279 and 3799, respectively.) The preconditioners 
used, which are mentioned in point 2) above, are given by inverses of the matrices corresponding to (pos- 



sibly oversampled) second-order finite difference (FD) approximations of the periodic ODE problem (21) 
(cf . reference [TS] and associated comments in Section [I]) . Since the ODE variable coefficients and the 
right-hand side in (10) are accurately represented by their scaled FC(Gram) continuations, a zero-padding 



procedure can be used to evaluate the ODE coefficients and right-hand side on a spatial grid that is finer, by 



a certain factor A ovor , than the one used in the original Fourier collocation discretization (22), thus easily 
leading to an oversampled preconditioner. With limited impact on the computing cost of the algorithm (see 
Section [5^4| ) , a fine grid FD discretization can improve significantly, by a factor of 1/N^ ver , the accuracy of 
the preconditioner — and thus, as demonstrated in Figure [l] its preconditioning capability. 

Figure [l] displays the number of iterations required by the GMRES-based preconditioned iterative ODE 
solver described above in this section to reach a residual tolerance io/oMRES; f° r various values of the 
preconditioner oversampling parameter A over . For these examples a two-point boundary value problem 
(10)-|lT|) was considered in the interval (a, b) = (0, 1) with coefficients given by p(x) = 24x/(l + Ax 2 ) and 



q(x) — (l + 8a; 3 )/(l-|-4a; 2 ), and with right-hand side and the Dirichlet boundary values such that the function 
u(x) = cos(x 2 + 2) is an exact solution, and the FC parameters iV<j = [26(1 + (N— 21) /100)J (see Section 5.2 ), 
= 10 and m — 5 were used. 

These images demonstrate the GMRES convergence character of the preconditioned FC-based periodic 
ODE solver: the iteration numbers remain bounded independently of the grid size, and they decrease as more 
accurate preconditioners are used. For comparison purposes, results obtained through preconditioning by 
means of the constant coefficient ODE solver [6| (with constant coefficients equal to the average of the given 
variable coefficients) are shown, the corresponding results are marked as "Const, coef." in Figure [TJ Clearly, 
in both plots, the constant-coefficient preconditioner is less efficient than the oversampled second-order FD 
preconditioners. 



5.2 Parameter selection for the scaled FC(Gram) algorithm of Section 3.3 



The exterior-source procedure for enforcement of boundary conditions introduced in Section |4.2[ requires 
evaluation of two right-hand sides, g a , € B(a, c), with support outside the interval (a, b) (see Section [4.2[ ). 
Plots in Figure |2] show our selections for the functions g a and <?& with — 26 (left) and Nd = 260 (right), 
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Figure 1: Number of GMRES iterations required for the solution of the linear system (22) as a function of 
the number N of grid points in the one-dimensional mesh, using the GMRES residual tolerances foZcMRES = 
10~ 15 (left) and 10~ 10 (right), for several finite-difference preconditioning oversampling numbers N ovel and 
a for a simple preconditioner based on a constant-coefficient approximation. 



taking in both cases an TV = 1000 point discretization of the interval (a, 6). As demonstrated on the left 
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Figure 2: Auxiliary right-hand sides g a , gi, used in the exterior-source procedure for = 26 (left) and 
N d = 260 (right). 



portion of Figure [2j if the number Nd of grid points outside the interval (a, b) is fixed, then the size of the 
support of the functions g a and gb tends to zero and, thus, although the maximum norm of these functions 
equal one for all values of Nd, both functions tend to zero in the root-mean-square norm as N — > oo. Thus, 



for fixed N d , the solutions u a and u& introduced in Section 4.2 and thus the matrices of the systems (23), 
tend to zero as N — > oo. Clearly, then, the exterior-source method put forth in Section 4.2 requires that 
Nd —> oo at least linearly with N as N — > oo. 



The scaled FC(Gram) method introduced in Section 3.3 was designed precisely to allow for evaluation 
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of Fourier continuation functions within a context that includes use of variable values of the parameter Nd 
while still allowing for re-use of the basic matching functions {/evenl^Lo anc ^ {/oddi^lo defined in Section 



3.2 



The upper and lower left portions of Figure [3] display the un-scaled and scaled continuations of the function 
f(x) — x 2 cos(x 2 ) using 

N d = [26(1 + (JV - 21)/100)J, (41) 

where [s\ denote the largest integer less than or equal to s. The right portion of Figure [3] demonstrates the 
fact that, provided values of Nd proportional to N are used (as is done here, according to equation (41)), 
the relative error of the scaled FC continuations (see Remark 14 1 decays at the same rate as its non-scaled 
counterpart as N tend to infinity. Although the scaled version is somewhat less accurate than the original 
FC(Gram) continuations proposed in [6], the scaling to the interval [1,1+ g] implicit in equation (41 ) (see 
Section 3.3 1 can be implemented through use of a small set of precomputed scaled matching functions. 

Remark 14. Throughout the present Section [5j the relative error of scaled and unsealed FC(Gram) continu- 
ations of given functions has been evaluated as the difference between the corresponding FC approximation 
and the corresponding exact function values over an equispaced grid ten times finer than the grid used in the 
FC procedure. (We have checked that, for the functions under consideration, use of higher grid refinement 
rates leads to essentially unchanged error estimates.) Clearly it is necessary to use such oversampling, as, 
although not exactly interpolatory, the FC procedure does arise from a least-squares approximation of func- 
tion values (at collocation points next to each one of the interval endpoints), and, thus, evaluation of 
errors solely at collocation points tend to produce error under-estimates. It is reasonable to expect (and we 
have verified in practice) that this difficulty does not arise when FC solutions of ODEs are concerned, since 
the FC BVP solution procedure does not seek to minimize error in function values at the collocation points. 
In spite of this fact, and for consistency, the solution errors presented in Figures [4] and [5] were evaluated on 
a grid resulting from ten-fold refinement. 

Remark 15. The right-hand image in Figure [3] demonstrates that the order of convergence of the scaled 
FC(Gram) method is consistent with the order obtained by the original un-scaled procedure, in this case 
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Figure 3: Fourier continuation of the function f(x) — x 2 cos(x 2 ) in the interval (0, 1) as produced by the 
original FC(Gram) method with Nd — 26 (upper-left), its scaled version using Nd = 260 (lower-left), and 
the corresponding maximum relative errors (relative to the maximum value of the original function) as a 
function of N (right). As demonstrated in Figure |4j the scaled algorithm is an essential element of the 
two-point boundary value solver introduced in Section |4j 
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1/iV 6 , and that both algorithms reach round-off machine error levels for approximately the same values of 
N. However, it is clear from this figure that the convergence of the scaled FC(Gram) algorithm is somewhat 
more irregular than that of the unsealed procedure. The fluctuating behavior observed in Figure [3] and 
accompanying accuracy loss (that amounts to as much as one digit for some values of N in this example, 
but does not otherwise detract from the overall convergence rate), can be traced to the fact that the scaled 



algorithm incorporates the composition (17)-(18) and, thus, associated frequency content, as well as the 



discontinuous selection (41 ) of the number of continuation points as a function of N. 



5.3 Accuracy of FC BVP solver: scaling and asymptotics 
5.3.1 Scaling and convergence of the exterior-source FC BVP solver 

This section demonstrates the effectiveness of the scaling procedure described in Section [33] with parameters 
as indicated in Sections |5.1| and |5.2| for the purposes of this demonstration the two-point boundary value 
problem introduced in Section |5.1| is considered. The left portion of Figure [4] displays in a solid blue line the 
maximum ODE solution error (relative to the maximum value of the ODE solution) based on use of the un- 
sealed FC(Gram) algorithm with a continuation region containing a fixed number = 26 of discretization 
points. For comparison purposes, the left-hand figure also displays the corresponding maximum relative 
error arising in the Fourier continuation of the ODE right-hand side using the same continuation region. 
Clearly, while use of a continuation region that does not grow with N does not affect the accuracy of the 
the FC(Gram) procedure, it does affects greatly the accuracy of the ODE solver: convergence beyond rather 
small values of N is not observed for the un-scaled ODE solution. However, as demonstrated in the right 
portion of Figure [4] use of the scaled FC(Gram) method (with, e.g., given by equation (|41|)), restores 
convergence to machine precision. Notice that the error in the ODE solver is a quantity of order (1 / N) m+1 , 
where m is the maximum polynomial degree in the Gram polynomial basis used for the scaled FC(Gram) 
procedure. 



-ODE error 

-scaled FC(Gram) error 




Figure 4: Maximum error (relative to the solution maximum) in the FC(Gram) approximation of the ODE 
right-hand side (denoted by "FC(Gram) error" in the figure) and corresponding FC ODE solution error for 
the two-point boundary value problem introduced in Section [O] Left: un-scaled procedure. Right: scaled 
version. 



Figure[5]demonstrates the dependence of the numerical accuracy on the preconditioner used. For N < 700 
all the preconditioners considered give rise to similar performance, but larger value of the oversampling 
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Figure 5: Maximum error (relative to the solution maximum) in the FC solution of the two-point boundary 
value problem considered in Section |5.3.1[ as a function of the number of discretization points used, for two 
values of the GMRES residual tolerance: toZcMRES = 10~ 15 (left) and 10~ 10 (right). 



parameter N ovcr do give rise to improved accuracies for the 10 10 GMRES tolerance and for N > 1000. 

5.3.2 Convergence for stiff problems: exterior-sources and asymptotic-matching 

This section demonstrates that an adequate combination of the scaling and asymptotic methods described in 
Sections |4.2| and |4.3| gives rise to a robust and accurate solver for arbitrarily stiff two-point boundary value 
problems under consideration (see Remark |10[) . For this demonstration we consider the variable-coefficient 



two-point BVP ( 10 )-( 11 1 in the interval (0, 1) with / = and with 



.2 2(1 + 21) ^ A , , (l + 2x) 2 



p(x) = e —— — — and q{x) = e 



0.51og(l + 2a;) + l Hy ' 0.51og(l + 2x) + 1 ' 

The exact solution of this problems is given by 

u{x) = Cx{e)M fe-s Q log(2x + 1) + 1 jj + C 2 (e)Bi hH Qlog(2x+l) + l 

where Ai and Bi are the Airy functions of first and second kind, respectively, and where C\{e) and C^e) 
are chosen in such a way that u(0) = 1 and u(l) = 0. The solution u has a boundary layer at the endpoint 
x = for small values of e. 

The left portion of Figure [6] displays the boundary layer solution u(x) for three values of e. The right 
image in the same figure presents the maximum relative errors that result from use of the exterior-source 



procedure (Section 4.2) and asymptotic-matching expansions (Section 4.3) of orders k = 1, 2 and 3. As 
expected, the exterior-source and asymptotic-matching procedures are both accurate within their intended 
realms of applicability (e bounded away from zero and e — > 0, respectively), but they both break down 
otherwise. Hence, our algorithmic strategy, which relies on one procedure or the other, depending on the 
value of e, results in accurate approximations for all values of e > 0. 

Remark 16. The adequate detection of the boundary layer in terms of the small parameter e, and, thus, the 
evaluation of the threshold value that defines the limit between the (h, e) regions for which the exterior-source 
and asymptotic-matching procedures are applied depends on each particular BVP. As a rule of thumb the 
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Figure 6: Boundary layer solution with small parameters e = 0.1, 0.01, 0.001 (left), and maximum relative 
errors, as a function of e, resulting from use of the asymptotic- matching expansions of order k = 1, 2, 3 as 
well as the exterior-source method with h — 10~ 2 (right). The dash-dot vertical line on the right plot is 
located at e = h = 10 -2 . 



threshold limit can be fixed to e = h (see Figure |6j). This selection (with e = \J At/2 and e = At for the 
heat and wave equations, respectively) was used in all of the numerical experiments presented in this paper. 
Section |7.2| and, in particular, Table [l] provide an indication of the computing costs required by the dual 
exterior-source/asymptotic-matching boundary-condition strategy we use. 



5.4 Computational cost of the FC BVP solver 

The computational cost of the proposed FC-AD method for the solution of time-dependent problems depends 
linearly on the cost of the evaluations of the ODE solution and evaluation mappings An and Bn defined 



in (37) and (38 1, respectively. Thus, an efficient numerical implementation of these operators translate into 
corresponding efficiencies for the resulting overall FC-AD time-marching scheme. 

The evaluation of the mapping An comprises two main components, namely 1) Fourier continuation (as 



described in Section 3.3 1 of the variable coefficients and the right-hand side, and 2) Numerical solution of the 



resulting linear system (22). In view of Remark|4j point 1) requires O(NlogN) operations. With regards to 
point 2), on the other hand, we note that every iteration of the GMRES method involves one evaluation of 
the discrete ODE operator Bn and one evaluation of the preconditioner. Inspection of Algorithm [2] and the 
Finite Difference preconditioning algorithm presented in Section [5 . 1 1 therefore shows that the overall cost of 
the of the ODE solver is 2FFT(iV + N d -1)+ 7V itcr FFT(7V ovcr (iV + N d - 1)) + 47V itcr FFT(2(7V +N d - 1)) + 
0(N OVCI (N + Nd — 1)), where FFT(M) denotes the number of operations required to evaluate an FFT of 
size M. In brief, the ODE solver runs in Ni tC iO(N log N) operations. 



6 Full FC-based PDE solver 

Based on algorithms introduced above in this text, this section introduces FC-based alternating-direction 
solvers for diffusion and wave propagation PDEs with variable coefficients. 
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6.1 Overall spatial discretization 



Let fl be a two-dimensional spatial domain with a piecewise smooth boundary which, without lost of general- 
ity, we assume is contained in the rectangle R = [0, L H ] x [0, L v ]. In order to produce a spatial discretization 
of f2, the rectangle R itself is discretized by means of a uniform Cartesian grid il/j given by 



n n {( Xi , Vj ) = ((* - i)h H , (j - i)h v ), i = i, . . . , m h , j = i,..., m v ], 



where, for some integers M H ,M V > 1 we have set h H = L H /(M H - 1) and h v = L V /(M V - 1). For the 
sake of simplicity we assume that each grid line only crosses the boundary dil twice, and we denote by 



{af,bf} = {x:(x, yj )£dSl} 
{aY,bY} = {y:(x i ,y)edn} 



(i = i,- 

(i=l,.. 



,M V ), 



and 



the points of intersection of dtt with horizontal and vertical Cartesian lines (shown as green circles and yellow 
squares in Figure [7]), respectively. (Generalization to cases for which more than two intersections occur for 
some Cartesian lines is straightforward.) The horizontal and vertical discretization lines within f2 and the 
corresponding sets of indexes, in turn, are given by 



P j 
P} 



H 



{(x,yj) e n h } 



-{ieN:{x l ,y j )en h } (j = 1, 
{j GN : (xi,yj) G Q h } (« = !,. 



.,M V ), 
,M H ). 



and 



Each alternating-direction half-step in the time-marching algorithm presented in Section 2.2 consists of 
a horizontal sweep (steps (D2) and (W2)) and a vertical sweep (steps (D4) and (W3)). For each horizontal 
(resp. vertical) sweep, the solver requires solution of a Dirichlet two-point BVP along each vertical (resp. 
horizontal) line PY (resp. Pj 1 )- The solution of each Dirichlet two-point BVP, in turn, involves application 



of the operators (38) (evaluation of the ODE right-hand side) and (37) (BVP solution) (see also Section 4.5 



and Remarks 12 and |13|). Sections 6.2 and 6.3 provide a detailed description of the discrete versions of the 
horizontal and vertical sweeps, for the problems of diffusion and wave propagation, in terms of the operators 
An and B^. 
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Figure 7: Sweeping procedure used for evaluation of the grid values 2 on vertical lines (magenta 
"+" crosses on the left) and the grid values it™ +1 on horizontal lines (red "x" crosses on the right). 
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Remark 17. Throughout Section |6| a bold-face symbol such as 4>i,j denotes the value of a grid function 
at a point (xi,yj) G fi/j at time t n , and for each fixed i (resp. for each fixed j), we define the vector 
</>". — ((f>™j)j eI y (resp. cf) n j = {4>i,j)iei H )- In addition, in the case of the diffusion and wave problems, with 



reference to ( 13 1 and il4h respectively we set 



pfj = ^"{xi^j), pYj = V v (x l , yj), q hj = Q(x h yj), and /;' , = F{x it yj,t n ). 
Since originally only the grid values of a and ft are known, the grid values of d x /3 and d y /3 needed to evaluate 



equation V H and V v in ( 13 1 and ( 14 ) are approximated at each line by the derivative of its scaled FC(Gram) 
continuation. 



6.2 FC-AD diffusion solver 



Taking into account Sections 2.2.1 4.5 and 6.1 (and, in particular, Remark 17 concerning bold-face symbols 



denoting grid functions and vectors of grid function values), our full FC-based ADI discrete procedure for 
the diffusion problem consists of the following four steps: 



(D1)jv Initialize u^j and w®^ on the vertical lines PY (j G I 



v 



1,3 



1. 




= B N {-pY,-q ti .)u9 



M 



and, for n = 0, 1, 



(D2)tv compute u™~^ 2 on the vertical lines (j G iY , i = 1, . . . , M H ), 



{ 9 n+ HbY) ) 



(D3)jv compute w^ 2 on the horizontal lines Pf (i G If , j = 1, .. . ,M 



1,3 



w 



1,3 



2u 



j 

n+i 
id 



w 



(D4)jv compute u™^ 1 on the horizontal lines Pf (i G If , j = 1, 



/ n+l 



trt 1 =A N (p H v q.A 



,M V ) 

.71+ 



\ 



f +1 {af) 



l n+1 (bf) ) 



Note that the initialization step (Dl)jy only requires no more than M H evaluations of the mapping Bjy, 
and every time step (D2jv)-(D4at) involves no more than M H + M v evaluations of the mapping Am — in 
other words, each time step requires no more than M H + M v solutions of one-dimensional two-point BVP 
with Dirichlet boundary conditions. The computational cost of these ODE solvers is analysed in Section [5T] 
The cost of the overall diffusion PDE solver (which, in brief, runs at FFT speeds), is illustrated in Tables [l] 
and El 
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6.3 FC-AD wave propagation solver 



Taking into account Sections |2.2.2||4~5"1 and |6.l| as well as Remark[l7j our full FC-based ADI discrete procedure 
for the wave propagation problem consists of the following three steps: 

(W1)jv Initialize and uj j on the vertical lines PV (j 6/f, i = 1, . . . , M ff ), 

" —u (xi,yj), 
,.i - uo(xi,yj) + Atui(xi,yj), 




and, for n = 0, 1, . . . , n max , 
(W2) jv compute t«"t 2 on the vertical lines ( j G P7 , 



1, 



/2m™ 



,Af 



?" +1 K y ) 



) 



(W3)jv compute u™^ 1 on the horizontal lines Pf (i € P?, j = 1, 



" +1 (af 



Every time step (W2jv)-(W3jv) in the present wave equation algorithm requires at most M H + M v 
evaluations of the mapping and thus, in view of Section |5.4[ the overall FC-based wave propagation 
solver runs at FFT speeds. 



7 Numerical results 

This section presents a variety of numerical results demonstrating the accuracy, unconditional stability, 
reduced computational cost and spatial dispersionlesness of the variable-coefficient FC-AD algorithms intro- 
duced in this paper. Implementation details and hardware setup used include the following: 

1. All numerical simulations presented in this section have resulted from Fortran implementations of 
the FC-AD algorithms introduced in previous sections, running on a single processor Intel® Xeon® 
(model X5570) at 2.93 GHz with 8MB cache size. 



2. In accordance with Sections |5.1| and |5.2[ the numerical simulations presented in this section use the 
FC(Gram) parameters N A = 10 and N d = [26(1 + (N - 21)/100)J, and, following 0Q1], the values 
7ti = 5 and 4 for PDE solver of the diffusion and the wave model, respectively. (In agreement with those 
references we have found that these values of m ensure unconditional stability of the FC-AD algorithm.) 



In all cases the oversampling parameter used for the GMRES prcconditioncr (Section 5.1) is set to 
A?over = 4, and boundary conditions are enforced by means of the hybrid exterior-source/asymptotic- 



matching algorithm (see Remark 16 ) 



3. At each time step t n the solution is stored as a matrix (u™^ -1 ) of size M H x M v containing the 
approximate solution values at (xi,yj) € Qh and zeroes for (xi,yj) (jL Slh). 
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4. Since the forward operator Bn and the BVP solver An use periodic extensions of the original problem, 
the Fortran implementation of both procedures for each horizontal and vertical grid lines use work 
vectors with more entries than the number of points supported on lines P? 1 and Pj of ilh- For each 
half-time step ((D2)n, (W2)jv, (D4)jv and (W3)jv) and each associated horizontal and vertical line, 
only the output quantities for indices in the corresponding sets I? and ij (that is, for the corresponding 
points in the computational domain O^) are stored in the matrices (lif^). 

5. All needed FFTs are performed using the FFTW library [10]. In addition, FFTW have been used to 
transpose in-place the matrices to preserve the contiguous memory access (column-wise in the 
Fortran implementation) prior to the needed transfers of horizontal or vertical lines to the BVP solver 
work vector mentioned in point [4] 



7.1 Boundary conditions: exterior-source and asymptotic-matching procedures 

As indicated in Remark [T6j our variable-coefficient FC-AD algorithm automatically selects the mechanism — 
either exterior-sources or asymptotic-matching — for enforcement of boundary conditions. In view of the 



discussion in Section 5.3.2 it is clear that the hybrid exterior-source/asymptotic-matching procedure can be 
used to produce accurate solutions for arbitrary time-steps in computing times per time-step that remain 
bounded as At — ¥ 0. The present section, in turn, presents numerical results that demonstrate quantitatively 
the impact of the hybrid approach, in terms of computing time and accuracy, on the solution of full PDE 
problems. 

Table [l] which presents computing times required by the FC-AD diffusion solver for 1000 x 1000 two- 
dimensional grid and for two different values of the time step At (in the particular case of the first diffusion 



problem considered in Section 7.2 see also Figure |9j), provides some insight into the computing costs required 
by the hybrid approach in each of the two possible (h, At) regimes. The "Setup" columns in this and 
subsequent tables display the overall time used in precomputations — including each one of the following 
operations for each horizontal and vertical line Pj 1 and P^ : 1) Precomputation of FFTW plans [10] in 
real- valued arithmetic; 2) Evaluation of the scaled FC(Gram) continuations for the variable coefficients and 



the right-hand side; 3) LU factorization of the preconditioning finite-difference matrix (see Section 5.1| 



4) Evaluation of the auxiliary solutions for treatment of boundary conditions (see Section 4.2 and 4.3); and 

5) Initialization of the initial time step. As can be gleaned from Table [Tj for the diffusion problem under 
consideration, the FC-AD asymptotic-matching procedure leads to somewhat smaller overall setup times 
but comparable times per time-step as the corresponding exterior-source method; similar remarks apply to 
our FC-AD implementation of the variable-coefficient wave propagation problem. Thus, the asymptotic- 
matching method resolves the boundary layers that arise for small values of At (which cannot be accurately 
discretized by the exterior-source method, unless unduly fine grids are used) at a cost comparable to that 
which would be required by the exterior-source method in absence of boundary layers. The exterior-source 
method, in turn, can adequately treat cases in which no significant boundary layers exist — for which the 
asymptotic-matching method would be inaccurate; cf. Figure [6] and [8] 

To demonstrate, in a simple context, the hybrid exterior-source/asymptotic-matching procedure for en- 
forcement of boundary conditions (Sections |2.2.2 4.2 and 4.3 ), here we apply the hybrid algorithm in conjunc- 



tion with a one-dimensional FC-based time marching scheme to a one-dimensional wave-propagation problem: 
the one-dimensional wave equation Q with variable coefficients a(x) = 1 + 4x 2 and j3{x) — 2 — x + 8x 2 
in the unit interval, and with Dirichlet boundary conditions such that the exact solution is given by the 
oscillatory function u(x,t) — sin(100x — 2nt). Figure [8] displays the resulting maximum relative solution 
error as a function of the time step At and the grid size h. The left error map in this figure presents the 
relative errors that result as the exterior-source procedure is used for all values of (h, At). As expected, in 
presence of boundary layers (that arise for small values of At) , the overall numerical approximation provided 
by the exterior-source algorithm is completely inaccurate. The right-hand error map in Figure [HJ on the other 
hand, displays the errors that result from the hybrid boundary-conditions algorithm. The dashed line in the 
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Boundary condition 




to/GMRES = 10 15 


toZ GM RES = 10" 10 


fo^GMRES = 10 6 


enforcement 


-^ovcr 


Setup 


Time step 


Setup 


Time step 


Setup 


Time step 


h = 1(T 3 , 


1 


13.493 


6.169 


5.617 


2.265 


4.264 


1.387 


At = 5 x 1CT 3 , 


2 


8.119 


3.576 


5.693 


2.129 


4.943 


1.480 


(exterior-sources 


4 


8.628 


3.788 


6.343 


2.332 


5.408 


1.816 


regime) 


8 


11.700 


5.189 


8.816 


3.273 


7.532 


2.548 


h = 10^3, 


1 


3.039 


6.301 


3.005 


3.339 


3.041 


1.478 


At = 1CT 6 , 


2 


3.182 


5.976 


3.142 


2.474 


3.180 


1.626 


(asymptotic-matching 


4 


4.175 


6.326 


4.147 


2.785 


4.183 


1.848 


regime) 


8 


5.499 


8.275 


5.456 


3.630 


5.496 


2.585 



Table 1: CPU time (in seconds) required by our implementation of the FC-AD method, for the variable- 
coefficient diffusion problem mentioned in Section |7.1| in two different regimes of the boundary condition 
enforcement algorithm. 



right-hand map separates regimes in the hybrid approach: above this line the exterior-source algorithm was 
utilized, below this line the asymptotic-matching method was used. Close consideration of the right-hand 
error map shows that the hybrid boundary-conditions algorithm leads to an overall convergent PDE solver. 
In the right-most region of the right-hand error map (larger values of h), the error, which results mostly from 
the coarseness of the spatial FC discretization, is essentially independent of At. In the left-most portion of 
the map, where finer spatial grids are used, the spatial discrete errors are negligible in comparison with the 
time discretization errors, and hence the resulting errors depends only on the time-step value At. 




h 



Figure 8: Maximum relative error in the FC-AD approximate solution of a one dimensional wave equation 
of the form ([2| using solely the exterior-source procedure (left) and using the hybrid boundary-conditions 
algorithm (right). 



7.2 Diffusion problem: performance, convergence and stability 

A variety of numerical examples presented in this section demonstrate the character of the FC-AD scheme for 
diffusion problems with variable coefficients. For the first set of tests of this section we consider a problem of 
the form ([I]) with variable coefficients given by a(x, y) = x + y + 1 and (3(x, y) = 2x + 0.5y + 1 in the domain 



25 



bounded by the curve (x/9) 6 + (y/5) 6 = (1/20) 6 . The right-hand side and the boundary conditions have been 
selected in such a way that the function u(x, y, t) = sin(7r(3iz; 2 + 2?/ 2 + 2i;)) is the exact solution of the problem. 
For our At convergence studies the FC-AD numerical solution is produced up to the final time T = 0.1. The 
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Figure 9: Left: Maximum relative errors in the FC-AD approximate solution of the first diffusion problem 
mentioned in Section [7.2[ as a function of the time step At, for three different spatial grids. Right: Compar- 
ison of these solution errors with those obtained, for the same problem, via an application of the Richardson 
extrapolation method leading to two orders of improvement in the temporal convergence rate. 



spatial discretizations are given by uniform grids of size h within the square [—0.5, 0.5] x [—0.3, 0.3], whereas 
the preconditioner uses the oversampling ratio N OVCI = 4 and the GMRES tolerance toZcMRES = 10~ 10 . 

Figure [9] displays maximum relative errors throughout Qh (evaluated through comparison with the exact 
solution) produced by the second-order in time FC-AD algorithm described in Section 6.2 for h = 0.01, 
0.005, and 0.001 (left plot in Figure [9]), as well as corresponding results obtained by means an additional 
application of the Richardson extrapolation procedure (right plot in Figure [9j see [5] and references therein 
for details on the the application of the Richardson extrapolation method in the time domain). Since the 
spatial discretization errors for different grids are negligible with respect to the time discretization (at least 
for the the coarser time steps) , the relative error exhibits the expected second-order convergence that results 
from the ADI time-marching scheme described in Section |2.2| Additionally, the convergence displayed in 
this figure demonstrates that the underlying solver is not subject to the ordinary CFL constraint At ~ h 2 
required by explicit solvers: for the case h = 0.001 the largest At values used here are in fact four orders 
of magnitude larger than would be allowed by the quadratic CFL constraint. (In fact, our experiments 
suggest that At can be increased arbitrarily without leading to instability: values as large as At = 100 and 
At = 1000, etc, lead to stable, albeit inaccurate solutions.) 

To demonstrate the performance of the FC-AD method for general geometries we consider the curved, 



non-convex heating circuit structure depicted in Figure 10 The variable heat-equation coefficients used 
correspond to (variable) thermal constants of silicon; the geometry, in turn, represents a 80 x 120 rectangular 
plate containing a curved heating circuit. The thermal conductivity varies from 5 to 50 (the typical range 



for silicon) as shown in the left plot of Figure 10 The initial temperature has been fixed to uo = 55. 
The temperature profile on the exterior boundaries of the plate is fixed also to 55, whereas that the inner 
boundaries of the heating circuit are driven by the function g(x, y, t) = 55 + 30sin(207rf). 

The right plot in Figure [l0| presents the temperature field at time T = 0.1 — that is, one period of the 
boundary data function g — evaluated by means of the FC-AD algorithm described in Section 6.2 (without 
Richardson extrapolation, for simplicity) on a spatial grid containing 800 x 1300 discretization points. The 
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Figure 10: Variable thermal conductivity (left) and temperature field produced by the FC-AD method at 
time T = 0.1 (right) for the second diffusion problem mentioned in Section 



7.2 



corresponding maximum errors were evaluated through comparison with a reference numerical solution 
produced using At = 2.5 x 10~ 5 on a 1600 x 2600 spatial grid: it was found that the 800 x 1300 solutions 
with At = 10~ 4 and 5 x 10~ 5 and fo^GMRES = 10~ 10 contain maximum errors of 0.054% and 0.013%, 
respectively — demonstrating, in particular, second order convergence in time. Analogous relative errors at 
somewhat faster computing times result from use of the tolerance value toZcMRES = 10 -6 . 







to^GMRES — 10" 


-10 


tol 


GMRES = 10" 


-6 


At 


■^Y over 


Setup Time step 


Total 


Setup 


Time step 


Total 


10" 4 


4 


33.989 2.343 


268.289 


25.798 


2.107 


236.498 


5xl0~ 5 


4 


33.696 2.406 


274.296 


23.944 


1.883 


212.244 


2.5xl0~ 5 


4 


34.095 2.371 


271.195 


23.271 


1.693 


192.571 



Table 2: CPU times (in seconds) required to evolve the heating-circuit FC-AD solver for a total 100 time 
steps on a 800 x 1300 grid for various values of the time step At. 



The CPU times required to obtain the various heating-circuit solutions mentioned above are presented 
in Table [2] for two different values of the GMRES residual tolerance to/cMRES (both of which yield similar 
errors, as indicated previously in this section, for the values the parameters under consideration presently). 



7.3 Wave propagation problem: performance, convergence and stability 

The following two subsections demonstrate the properties of the FC-AD solver for the wave equation with 
variable coefficients. Section |7.3.1| highlights the highly significant performance gains that arise from the 
low spatial dispersion inherent in the FC methodology, including comparisons with finite-difference methods. 



Section 7.3.2 in turn, presents an application of the FC solver to a non-trivial geometry, and it demonstrates 
the convergence and stability of the approach. 
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7.3.1 Spatial dispersionlessness 



We demonstrate the spatial dispersionlessness of our FC wave-equation solvers by means of the one-dimensional 
problem defined by the equation a(x)duu — d x {fi(x)d x u) = with variable coefficients a(x) = 1 + x/2 and 
j3{x) = 2/(2 + x) in the spatial interval (a, b) = (0, 1), with final time T = 1, and with Dirichlet boundary 
conditions such that the exact solution is given by u(x) = sin(27r/(a; 2 /4 + x + £)). The FC-based time 
marching scheme for the present one-dimensional case is analogous to that presented in Section [2.2.2| for two 
dimensions but, of course, in here use of alternating directions is neither possible nor necessary. In order 
to focus attention on the spatial dispersion properties of the algorithm, our first example in this section 
uses a fixed time discretization which gives rise to errors smaller than all the corresponding spatial errors: 
At = 8 x 10~ 7 used in conjunction with second order Richardson extrapolation. The spatial grids used, 
in turn, are frequency dependent: they are taken to hold a prescribed number of points per wavelength 
(PPW) . Since the coefficients and thus the wavelength vary within the physical domain, the PPW quantity 
is defined as the number of discretization points used within the shortest wavelength — which, in our case, 
can be defined as the minimum value of the "wavelength function" min{l/(/(x/2 + l)) : x S [0, 1]} = 2/(3/). 




Figure 11: Maximum relative errors in numerical solutions of the one-dimensional wave equation in the time 
interval [0, 1] using a fixed number of Points-Per- Wavelength (PPW). Left: one-dimensional finite difference 
solver. Right: one-dimensional FC-AD solver. 



Figure [TTJ displays maximum errors relative to the maximum solution values for numerical solutions 
produced by two spatial differentiation methods, namely second-order finite differences (left portion of Fig- 
ure 11) and Fourier continuation (right portion of Figure 11). The time discretization in both cases is the 



one described above: second-order Richardson extrapolation with At = 8 x 10~ 7 . It can be seen from these 
figures that, even at the lowest frequencies (lowest number of wavelengths in the domain (0, 1)), the FC so- 
lutions for 15 and 20 PPW are significantly more accurate than the corresponding finite-difference solutions, 
and, for such low frequencies, only the finite-difference solutions using 60 and 80 PPW approximately match 
the error in the 15 and 20 PPW FC solutions. As the frequency increases the numerical error in the FC 
solutions remains essentially constant, that is, the FC solver is essentially spatially dispersionless. The finite 
difference scheme, on the other hand, is not: numerical errors increase significantly with frequency, and an 
80 PPW mesh can only produce 1% accurate solutions for problems containing no more than five PPW. 
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7.3.2 Wave propagation problem: complex structures 

This section demonstrates the properties of the overall FC-AD scheme for problems of wave motion with 
variable coefficients. In our first example we consider the wave propagation problem ^ with variable 
coefficients given by a(x,y) = 1 + x + y, f3(x,y) = 2. Ox + 0.5y + 1 in the domain bounded by the curve 
(x/9) 6 + (y/5) 6 = (1/20) 6 . The PDE right-hand side and boundary conditions are selected in such a way 
that the function u(x,y,t) = sin(7r(x + 2y — t)) is the exact solution of the problem. Uniform meshes of 




Figure 12: Left: Maximum relative errors in the FC-AD approximate solution of the first wave propagation 
problem mentioned in Section [7.3.2[ as a function of the time step At, for three different spatial grids. Right: 
Comparison of these solution errors with those obtained, for the same problem, via an application of the 
Richardson extrapolation method leading to one order of improvement in the temporal convergence rate. 



grid-size h are used to discretize the square [—0.5, 0.5] x [—0.3, 0.3] which contains the PDE domain. For our 
At convergence studies, errors in the FC-AD numerical solution are evaluated at the final time T = 0.1. The 
GMRES tolerance is set to toZcMRES = 10~ 10 , and the preconditioner uses the oversampling ratio N oveI = 4. 

Figure 12 presents maximum values of the solution error throughout (evaluated through comparison 
with the exact solution) relative to the maximum value of the solution, that results from use of the FC-AD 
algorithm described in Section 2.2.2 for h — 0.01, 0.005, and 0.001 (left plot), as well as corresponding results 
obtained by means an additional application of the Richardson extrapolation procedure (right plot). Since 
the spatial discretization errors for different grids are negligible with respect to the time discretization, the 
relative error exhibits the expected first and second-order convergence that results from the FC-AD time- 
marching scheme (described in Section 2.2 1 and the application of the Richardson extrapolation procedure, 
respectively. Once again, the unconditional stability of the FC-AD time-marching scheme allows us to 
consider a wide range of time steps, without CFL-type restrictions. Notice that, as a result of the hybrid 
exterior-source/asymptotic-matching procedure for enforcement of boundary conditions, small values of the 
time-step At (and the associated boundary layers) do not give rise to accuracy losses even when the coarse 
mesh-size h — 0.01 is used. 

To illustrate the applicability of the FC-AD wave solver for PDEs with general spatially variable co- 
efficients, we consider the waveguide depicted in the left portion of Figure |13| which consists of a pair of 
curved channels within a rectangular plate of dimensions 1x2 with chamfered corners. With reference to 
equation ([2|, the material is characterized by variable coefficients a and (3 where a varies between 1 and 
20 as depicted in the left portion of Figure [l3j and where (3 = 1. The structure is excited by a spatially 
Gaussian harmonic pulsed source of frequency / = 3n supported in a disk of radius 0.05 centered at point 
(0.5,1). The solution at the final time T = 4, which is displayed on the right portion of Figure 13 was 
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obtained using At = 10~ 2 on a 200 x 400 spatial grid and toZcMRES = 10~ 6 in a total computational time 
of 194.147 seconds (requiring 18.147 for the setup and 0.440 seconds per time-step). Using, for reference, a 
solution computed via a 400 x 800 spatial grid and At = 2.5 x 10~ 3 , it was found that the solution above 
contains an error of 0.002%. 



Conclusions 

We have introduced Fourier-based alternating direction time-marching schemes for the numerical solution 
of linear PDEs with variable coefficients. Following [HJ 112) , our use of the Fourier continuation method in 
combination with the ADI time-marching scheme and the Fast Fourier Transform, gives rise to a highly 
desirable combination of properties, namely, unconditional stability and high-order accuracy and spatial 
dispersionlessness at FFT speeds in the general context of non-periodic functions and for general domains. 
A variety of numerical results demonstrate the properties of the resulting solvers for problems of diffusion 
as well as wave propagation and scattering in media with spatially varying characteristics. 



A An auxiliary lemma 



Lemma 1. Let q l , g a and <?b be smooth functions defined in the interval [b,c], and let q e be strictly positive 



in that interval. If g a and g^ satisfy the conditions ( 25 ) , then the overdetcrmincd ODE problem 



,dv »d 2 v . . 

v-p^z -q -jzs =9a+Mb m(0,c), 



dx 



dx 2 



dx 



dx 



(A.l) 
(A.2) 



is not solvable: equations (A.1)-(A.2| do not admit solutions v for any real value of the constant /i. 



Proof. Assume a solution v of the problem (A.1)-(A.2) exists. Denoting by G(x,t;) the Green function of 
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the problem, 



G{x, - P^G(x, - ?q^G(x, = <f (a!=0 in (6, c), 



G(6,O = G(c,O = 0, 



and letting 



h = g a + H9b, 



(A.3) 



the solution v can be expressed in the form 

i>(:r) 



Taking into account the Neumann boundary conditions (A. 2 1 we then obtain 

* fOG. 
dx 
dv ^ 
dx 



-(b) 
'(c) 



dx 



(ft, = 



— ( c ,oMOde = o. 



(A.4) 
(A.5) 



Now, as is known (see e.g. in [TTJ Ch. V.28]), the function dG/d£, satisfies the ODE problems 



and 



dG 

d£ 
dG 
~dj 

dG 
dG 



(x,b) -p 



_ d (dG 



-(x,b) 



(M) = 7 



dx \ <9£ 
1_ dG 



d 2 (dG 



dx 2 \ d£ 
(c,6) = 



(x,b) = in (b,c), 



(x,c) -p 



_ d (dG 



(b,c) = 0, 



dx \ <9£ 
dG 



(x,c) 



d 2 (dG 



a , (c,c) = 
d£ 9 (c) 



dx 2 \ dt; 

1 



(x,c) = in (b,c), 



In view of the identity ^r(x, £) = ^r(£, x) (which follows from the symmetry G(x, £) = G(£, x) of the Green 



function) it follows that the function 



<9G 



<9G 



satisfies the two-point boundary-value problem 



dH „d 2 H . . 

H - p— - q -j-^ = in (b, c), 



dx 



dx 2 



1 



H(c) = 



Applying the strong maximum principle 9] to this elliptic equation we obtain the estimate 

H(x) 



dG dG 

— (6, x) - — (c, x) > C > for x € [&, cl, 
ax ax 



where C is the strictly positive constant C — min{l/g (b), 1/q (c)}. From (25), (A.3), (A.4) and (A.5) we 
thus obtain 



= 



dG 



(b,x) 



dG, 



dx ' dx 
which is a contradiction, and the lemma follows. 



(c, x) I h(x) dx > C / (g a (x) + figi,(x)) dx = C / g a (x)dx>0, 



□ 
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